Accuracy is Not Agreement: Expert-Aligned Evaluation of Crash Narrative Classification Models
Bhagat, Sudesh Ramesh, Shihab, Ibne Farabi, Sharma, Anuj
This study investigates the relationship between deep learning (DL) model accuracy and expert agreement in classifying crash narratives. We evaluate five DL models, including BERT variants, the Universal Sentence Encoder (USE), and a zero-shot classifier, against expert-assigned labels on the same narratives, and extend the analysis to four large language models (LLMs): GPT-4, LLaMA 3, Qwen, and Claude. Our findings reveal an inverse relationship: models with higher technical accuracy often show lower agreement with human experts, while LLMs demonstrate stronger expert alignment despite lower accuracy. We use Cohen's Kappa and Principal Component Analysis (PCA) to quantify and visualize model-expert agreement, and employ SHAP analysis to explain misclassifications. Results show that expert-aligned models rely more on contextual and temporal cues than on location-specific keywords. These findings suggest that accuracy alone is insufficient for safety-critical NLP tasks. We argue for incorporating expert agreement into model evaluation frameworks and highlight the potential of LLMs as interpretable tools in crash analysis pipelines.
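The Cohen's Kappa statistic the abstract relies on corrects raw agreement for agreement expected by chance. A minimal sketch of the computation, using hypothetical crash-type labels (the label set and values are illustrative, not taken from the study's data):

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences:
    kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement
    and p_e is the agreement expected from each rater's marginals."""
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    ca, cb = Counter(labels_a), Counter(labels_b)
    p_e = sum(ca[k] * cb[k] for k in ca) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# hypothetical expert vs. model labels for six crash narratives
expert = ["rear-end", "rear-end", "angle", "sideswipe", "angle", "rear-end"]
model  = ["rear-end", "angle",    "angle", "sideswipe", "angle", "rear-end"]
kappa = cohen_kappa(expert, model)
```

A model can post high accuracy yet a low kappa when the label distribution is skewed, which is one way "accuracy is not agreement" shows up in practice.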
Evaluating Generative AI-Enhanced Content: A Conceptual Framework Using Qualitative, Quantitative, and Mixed-Methods Approaches
Generative AI (GenAI) has revolutionized content generation, offering transformative capabilities for improving language coherence, readability, and overall quality. This manuscript explores the application of qualitative, quantitative, and mixed-methods research approaches to evaluate the performance of GenAI models in enhancing scientific writing. Using a hypothetical use case involving a collaborative medical imaging manuscript, we demonstrate how each method provides unique insights into the impact of GenAI. Qualitative methods gather in-depth feedback from expert reviewers, analyzing their responses using thematic analysis tools to capture nuanced improvements and identify limitations. Quantitative approaches employ automated metrics such as BLEU, ROUGE, and readability scores, as well as user surveys, to objectively measure improvements in coherence, fluency, and structure. Mixed-methods research integrates these strengths, combining statistical evaluations with detailed qualitative insights to provide a comprehensive assessment. These research methods enable quantifying improvement levels in GenAI-generated content, addressing critical aspects of linguistic quality and technical accuracy. They also offer a robust framework for benchmarking GenAI tools against traditional editing processes, ensuring the reliability and effectiveness of these technologies. By leveraging these methodologies, researchers can evaluate the performance boost driven by GenAI, refine its applications, and guide its responsible adoption in high-stakes domains like healthcare and scientific research. This work underscores the importance of rigorous evaluation frameworks for advancing trust and innovation in GenAI.
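Among the automated metrics the framework names, ROUGE is based on n-gram overlap between a candidate text and a reference. A minimal unigram (ROUGE-1) sketch in plain Python, with hypothetical example sentences (the sentences and variable names are illustrative, not from the manuscript's use case):

```python
from collections import Counter

def rouge1(candidate, reference):
    """ROUGE-1 as unigram overlap: returns (precision, recall, F1)."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())  # clipped unigram matches
    p = overlap / sum(c.values())
    rec = overlap / sum(r.values())
    f1 = 2 * p * rec / (p + rec) if p + rec else 0.0
    return p, rec, f1

# hypothetical reference sentence vs. a GenAI-edited candidate
ref = "the model improves coherence and readability of the draft"
cand = "the model improves the readability and flow of the draft"
p, rec, f1 = rouge1(cand, ref)
```

Production evaluations would use an established implementation (and longer n-grams or longest-common-subsequence variants), but the overlap idea is the same.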
Unifying data and AI terms for all - ITU Hub
The world is witnessing rapid technological advances in the fields of data science and artificial intelligence (AI). From helping fight climate change to addressing all the other sustainable development goals of the United Nations, valuable use cases show how cutting-edge data and AI applications can improve our daily lives. At the same time, public awareness initiatives are still behind the curve, leaving many people feeling ambivalent about AI. Moreover, for non-technical readers, disparate definitions of data and AI terms can impede easy understanding of these dynamic fields. Despite global summits, educational publications, and ample media coverage, the fields of AI and data science stand to benefit from an agreed set of accessible definitions and terminologies.